perm filename RALPH[1,RWF] blob sn#702800 filedate 1983-03-11 generic text, type C, neo UTF8
		 		EXTERNAL MEMORY
Tapes

There are two kinds of tape drive at LOTS.  The older TU45 drives have a
density of 800 or 1600 bytes per inch and a speed of 75 inches per
second; at the higher density this gives a maximum information transfer
rate, ignoring start/stop times and gaps, of 120,000 bytes/sec.  At 5 bytes/word, this
implies a time of 42 microseconds/word.  Ordinarily, the inner loop of a
sorting algorithm will process data faster than the TU45 can deliver it.
So tape sorting algorithms will be limited entirely by tape speeds, and
sophisticated algorithms will be economical.

Information is stored in records of at most 30,700 bytes, each followed
by a 0.6 inch gap; crossing the gap takes 8000 microseconds, adding an
average of 1.3 microseconds/word for data stored in long records.  Tapes
are 2400 feet long; written in maximum-length records at 6250 bytes/inch,
this gives a capacity of 160 megabytes (32 million words).  At the TU45's
1600 bytes/inch the capacity is about 45 megabytes (9 million words).
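
The gap and capacity arithmetic above can be checked with a short
sketch.  The figures are taken from the text; the function names are
illustrative only, and the capacity estimate assumes every record is of
maximum length and is followed by one gap.

```python
# Figures from the text; all names here are hypothetical.
BYTES_PER_WORD = 5
TAPE_SPEED_IPS = 75              # TU45 speed, inches per second
GAP_INCHES = 0.6                 # inter-record gap
MAX_RECORD_BYTES = 30_700
TAPE_LENGTH_INCHES = 2400 * 12   # 2400 feet


def gap_time_us(gap_inches=GAP_INCHES, speed_ips=TAPE_SPEED_IPS):
    """Time to cross one inter-record gap, in microseconds."""
    return gap_inches / speed_ips * 1e6


def gap_overhead_us_per_word(record_bytes=MAX_RECORD_BYTES):
    """Gap-crossing time spread over the words of one record."""
    words = record_bytes / BYTES_PER_WORD
    return gap_time_us() / words


def tape_capacity_bytes(density_bpi, record_bytes=MAX_RECORD_BYTES):
    """Bytes on one reel, assuming full records each followed by a gap."""
    inches_per_record = record_bytes / density_bpi + GAP_INCHES
    records = int(TAPE_LENGTH_INCHES / inches_per_record)
    return records * record_bytes


if __name__ == "__main__":
    print(f"gap time:     {gap_time_us():.0f} microseconds")
    print(f"gap overhead: {gap_overhead_us_per_word():.2f} microseconds/word")
    for density in (1600, 6250):
        mbytes = tape_capacity_bytes(density) / 1e6
        print(f"capacity at {density} bytes/inch: {mbytes:.1f} Mbytes")
```

Shorter records raise the overhead per word, since the gap time is
fixed while the number of words per record falls.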

The other kind of tape drive at LOTS is the TU78.  It has a density of
1600 or 6250 bytes/inch and a speed of 125 inches/second.  At the higher
density this is equivalent to 780,000 bytes/second; 156,000 words/second;
or 6.4 microseconds/word.  (Tapes as fast as 200 inches/second, or 4
microseconds/word, exist.)  At this speed, even a simple sorting
algorithm probably can't keep up with the tape, and a fast inner loop
becomes very important.
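
As a rough check on the transfer rates quoted for the two drives, the
sketch below computes bytes/second, words/second, and microseconds/word
from density and speed.  The names are illustrative, and the density of
the 200 inches/second tape is assumed to be 6250 bytes/inch, which is
what reproduces the 4 microseconds/word figure.

```python
BYTES_PER_WORD = 5  # as in the text


def transfer_rate(density_bpi, speed_ips):
    """Return (bytes/sec, words/sec, microseconds/word), ignoring gaps."""
    bytes_per_sec = density_bpi * speed_ips
    words_per_sec = bytes_per_sec / BYTES_PER_WORD
    us_per_word = 1e6 / words_per_sec
    return bytes_per_sec, words_per_sec, us_per_word


# (density in bytes/inch, speed in inches/sec); last entry is an assumption.
DRIVES = {
    "TU45": (1600, 75),
    "TU78": (6250, 125),
    "200 in/s tape": (6250, 200),
}

if __name__ == "__main__":
    for name, (density, speed) in DRIVES.items():
        b, w, us = transfer_rate(density, speed)
        print(f"{name:14s} {b:>9,.0f} bytes/s {w:>8,.0f} words/s {us:6.2f} us/word")
```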

Both types of tape can read either forward or backward, but only write
forward.  After a write, all data to the right of the new information is
lost, so ordinarily one thinks of writing only at the right end of the
tape.  (Some tapes are not so restricted, e.g. DECtapes.)  Start/stop time
is ______________; it can be avoided if the program can keep up with the
tape (?)

[** the concept of "to the right" seems peculiar in this context **]
[** I don't know if the start/stop time can be avoided **]
[** I can't find the start/stop time; I think it's about 2000
    or 3000 microseconds **]
Discs

A disc memory can be seen on many levels.  As a mechanical device, it has a
certain set of parameters (speed, capacity, access times, constraints).
When combined with its electronic controller the parameters are changed,
never for the better.  When incorporated into a computer, the parameters
may change again.  The adoption of a file-handling system and operating
system changes them once more.  We shall look at the LOTS discs at each
level.
[** "never for the better" is arguably false.  In terms of raw performance,
    it may be right.  However, real discs have certain error characteristics
    which may be masked or corrected by the controller.  It's certainly
    the case that no controller maker has poor performance as a goal  **]

[** The following describes the older LOTS RP06 discs **]
[** The disc is formatted for 128 words/sector = 4608 bits/sector.  On the
    disc, the term ``byte'' is meaningless. **]

At LOTS, the discs are of type RP06.  Each disc drive has 19 data
surfaces.  Each data surface is serviced by a single read/write head.  The
head is moved along a radial line; it can be placed at any of 800
positions along that line.  For any given position of a head, the area
that passes beneath the head is one track.  Each track is divided into 20
sectors of 128 words each.  The entire disc capacity is 128*20*19*800 =
38.9 million words.  (For comparison, the Palo Alto White Pages contain
about 160,000 entries averaging about 50 characters, or 1.6Mwords, 1/25 as
much.)
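
The capacity figure and the White Pages comparison can be reproduced
with a few lines.  This is only a sketch of the arithmetic above; the
constant names are illustrative, and the 5 characters/word packing is
the figure used earlier for tapes.

```python
# Geometry of the RP06 as given in the text.
WORDS_PER_SECTOR = 128
SECTORS_PER_TRACK = 20
DATA_SURFACES = 19
HEAD_POSITIONS = 800
CHARS_PER_WORD = 5  # packing assumed, as in the tape discussion


def disc_words():
    """Total capacity of one disc drive, in words."""
    return WORDS_PER_SECTOR * SECTORS_PER_TRACK * DATA_SURFACES * HEAD_POSITIONS


def white_pages_words(entries=160_000, chars_per_entry=50):
    """Approximate size of the phone book, in words."""
    return entries * chars_per_entry / CHARS_PER_WORD


if __name__ == "__main__":
    print(f"disc:        {disc_words():,} words")
    print(f"White Pages: {white_pages_words():,.0f} words")
    print(f"ratio:       {disc_words() / white_pages_words():.1f}")
```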

The 19 read/write heads are attached to one actuator that moves all the
heads at once.  The volume of data that can be accessed from one actuator
position is called a cylinder.

To read or write data, it is necessary to move the actuator to the 
proper cylinder, select the right head (or track), and wait for the
proper sector to come around.  Seek time is the time needed to move
the actuator; rotational latency is the time spent waiting for the
data to pass under the read/write head.

The seek time for moving from one cylinder to another is a complicated
function of the starting and ending cylinder numbers; it is zero if the
desired cylinder is the same as the current cylinder.  Otherwise it ranges
from about 3 ms to 40 ms depending on the distance and direction of
movement; the average is about 30 ms.
One can avoid seek times by staying within one cylinder; a cylinder
contains 19 x 2560 = 48,640 words.  Rotational latency varies from
zero to 16.667 ms (the rotational period).  In the absence of more
specific information, the average latency is assumed to be half
the rotational period.  Rotational latency is avoided when reading
consecutive sectors.  The disc can also switch from reading (or
writing) the last sector of one track to the first sector of another
track (on the same cylinder) without having to skip one rotation.
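
To make the cost of a single access concrete, here is a minimal sketch
that adds seek time, average rotational latency, and transfer time.  It
assumes 60 revolutions/second (a 16.667 ms period), the 30 ms average
seek quoted above, and 20 sectors per track; the function names are
illustrative, and the real seek time depends on the distance moved.

```python
ROTATION_MS = 1000 / 60        # rotational period at 60 revolutions/second
SECTORS_PER_TRACK = 20
AVG_SEEK_MS = 30.0             # average seek from the text


def avg_latency_ms():
    """Average rotational latency: half the rotational period."""
    return ROTATION_MS / 2


def sector_time_ms():
    """Time for one sector to pass under the head."""
    return ROTATION_MS / SECTORS_PER_TRACK


def access_time_ms(n_sectors, same_cylinder=False):
    """Estimated time to reach and read n consecutive sectors on one track."""
    seek = 0.0 if same_cylinder else AVG_SEEK_MS
    return seek + avg_latency_ms() + n_sectors * sector_time_ms()


if __name__ == "__main__":
    print(f"one sector, random cylinder: {access_time_ms(1):.1f} ms")
    print(f"one sector, same cylinder:   {access_time_ms(1, True):.1f} ms")
    print(f"full track, same cylinder:   {access_time_ms(20, True):.1f} ms")
```

Under these assumptions a one-sector read is dominated by the seek and
the latency; the transfer itself takes under a millisecond.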

[** Newer disks (the DEC RA81) have an embedded servo, which causes at least
one sector of latency when switching heads (tracks).]

At the hardware level, then, a track contains 20 x 128 =
2560 words.  Data can be read, starting at any sector boundary,
at a rate of 60 x 2560 = 153,600 words/second; that is, 
6.510 microseconds/word.  This is fast enough to keep
up with most sorting algorithms.
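
The sustained rate follows directly from the track size and the rotation
rate.  The sketch below repeats that arithmetic and, assuming head
switches within a cylinder cost no extra rotation as described earlier,
estimates the time to read a whole cylinder.  Names are illustrative.

```python
WORDS_PER_TRACK = 20 * 128     # 2560 words
REVS_PER_SECOND = 60
TRACKS_PER_CYLINDER = 19


def words_per_second():
    """One full track passes under the head on every revolution."""
    return REVS_PER_SECOND * WORDS_PER_TRACK


def us_per_word():
    return 1e6 / words_per_second()


def cylinder_read_ms():
    """One revolution per track, with no rotation lost on head switches."""
    return TRACKS_PER_CYLINDER * 1000 / REVS_PER_SECOND


if __name__ == "__main__":
    print(f"{words_per_second():,} words/second")
    print(f"{us_per_word():.3f} microseconds/word")
    print(f"{cylinder_read_ms():.1f} ms to read one cylinder")
```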

[*** The operating system provides seek optimization, fairness, and forced
ordering of transfers.  Seek (and latency) optimization reorders requests
so they can be processed with minimal head movement (or rotational delay).
Fairness is invoked to prevent an endless sequence of requests for one cylinder
from locking out other requests.  Forced ordering is applied
by the file system (as distinct from the code that deals with physical
devices) to particular transfers to preserve file system data integrity.]

At the operating system level, a programmer normally is sharing the disc
drive with many other users; his disc requests find the discs and arms in
almost random positions, except that central (i.e., mid-range) tracks
are most likely.  His requests for particular tracks will be queued and
processed by an algorithm which alternates going outward and inward, much
as an elevator alternates handling all upward requests with all downward
ones.  The operating system organizes information into pages of four
consecutive sectors (512 words); all information transfer is by full
pages.  Successive pages of a file are usually, but not always,
consecutive on the disc.  When a track is reached, reading always starts at the first word of
the track. [** Incorrect.  Any sector may be read first.]
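
The elevator-style queueing described above can be sketched as follows.
This is not the LOTS operating system's code, only a minimal
illustration of the idea: serve every pending request in the current
direction of travel, then reverse.  The function names and the sample
cylinder numbers are hypothetical, and "ascending" simply means toward
higher cylinder numbers.

```python
def elevator_order(pending, start, ascending=True):
    """Order in which one elevator sweep and its return visit the cylinders."""
    if ascending:
        first = sorted(c for c in pending if c >= start)
        second = sorted((c for c in pending if c < start), reverse=True)
    else:
        first = sorted((c for c in pending if c <= start), reverse=True)
        second = sorted(c for c in pending if c > start)
    return first + second


def total_travel(order, start):
    """Total number of cylinders crossed when serving requests in this order."""
    position, travel = start, 0
    for cylinder in order:
        travel += abs(cylinder - position)
        position = cylinder
    return travel


if __name__ == "__main__":
    requests = [700, 50, 410, 395, 120, 790]   # arrival order
    head = 400
    swept = elevator_order(requests, head)
    print("elevator order:", swept)
    print("elevator travel:", total_travel(swept, head), "cylinders")
    print("arrival-order travel:", total_travel(requests, head), "cylinders")
```

For this sample queue the sweep crosses about half as many cylinders as
serving requests in arrival order.  A fairness rule of the kind the
reviewer's note mentions is not modelled here.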

At the machine/assembly language level, disc pages are an addressable
virtual memory.  Some file pages are resident in main memory, at the whim
of the operating system.  Nonresident pages may be in a disc file, or in
``swapping space'' on the disc.  (Swapping space is allocated in the
middle range of cylinders to minimize seek time.)
A program can give the operating system advance warning of its need for
particular file pages.  The operating system reorders disc accesses
to optimize both seek time and rotational latency time.

At the Pascal level, a Pascal file is mapped into 32 consecutive pages of
the virtual address space, which is then treated as described above.

Another type of disc unit is the DEC RP07, with a capacity of 500 Mbytes.
It has 629 cylinders, 32 tracks per cylinder, and 43 128-word sectors per track.
This is an example of a non-removable disc.
A newer type is the IBM 3380, with 2.5 Gbytes of storage.  It has 4
independently moving arms, each with its own set of cylinders; in effect,
it is equivalent to 4 disc drives, each of 625 Mbytes.
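
The RP07 capacity can be checked from its geometry.  How many bytes a
word counts for is an assumption: the sketch shows both 36 bits/word
taken as 4.5 eight-bit bytes (the 36-bit word follows from 128 words =
4608 bits) and the 5 characters/word packing used for tapes.  The names
are illustrative.

```python
def disc_words(cylinders, tracks_per_cylinder, sectors_per_track,
               words_per_sector=128):
    """Capacity in words from the disc geometry."""
    return cylinders * tracks_per_cylinder * sectors_per_track * words_per_sector


RP07_WORDS = disc_words(629, 32, 43)

# Two ways of counting bytes; which one the 500 Mbyte figure uses is assumed.
RP07_BYTES_8BIT = RP07_WORDS * 36 // 8     # 36-bit words as 8-bit bytes
RP07_BYTES_5_PER_WORD = RP07_WORDS * 5     # five characters per word

IBM_3380_BYTES_PER_ARM = 2.5e9 / 4         # four independent arms

if __name__ == "__main__":
    print(f"RP07: {RP07_WORDS:,} words")
    print(f"      {RP07_BYTES_8BIT / 1e6:.0f} Mbytes at 4.5 bytes/word")
    print(f"      {RP07_BYTES_5_PER_WORD / 1e6:.0f} Mbytes at 5 bytes/word")
    print(f"3380: {IBM_3380_BYTES_PER_ARM / 1e6:.0f} Mbytes per arm")
```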

Disc storage can be purchased at about 25 Kbytes/dollar; at this price, it is
economically feasible to permanently retain all programs, text, and the
data produced by humans, buying more disc drives when old ones fill.
[**A fast typist can type 120 wpm = 720 bytes/minute = 86.4Mbytes/year = $3,456/year
in disk space.  Note that the cost of ownership of the disk is probably
about 40% of the purchase price per year.  In other words, a fast typist
costs an additional $1,400/year/year.  It is unclear if this is truly
economically feasible.]
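
The reviewer's estimate can be reproduced with the sketch below.  Two
figures are assumptions inferred from the numbers in the note rather
than stated in it: 6 bytes per typed word (720/120) and 2000 hours of
typing per year (which yields 86.4 Mbytes).  The names are illustrative.

```python
WORDS_PER_MINUTE = 120
BYTES_PER_TYPED_WORD = 6       # assumption: 720 bytes/minute / 120 wpm
HOURS_PER_YEAR = 2000          # assumption: reproduces 86.4 Mbytes/year
BYTES_PER_DOLLAR = 25_000      # 25 Kbytes/dollar, from the text
OWNERSHIP_FRACTION = 0.40      # yearly cost of ownership / purchase price


def bytes_typed_per_year():
    return WORDS_PER_MINUTE * BYTES_PER_TYPED_WORD * 60 * HOURS_PER_YEAR


def purchase_cost_per_year():
    """Dollars of new disc space needed each year for one typist."""
    return bytes_typed_per_year() / BYTES_PER_DOLLAR


def added_ownership_cost_per_year():
    """Each year's purchase adds this much to every later year's upkeep."""
    return OWNERSHIP_FRACTION * purchase_cost_per_year()


if __name__ == "__main__":
    print(f"{bytes_typed_per_year() / 1e6:.1f} Mbytes/year")
    print(f"${purchase_cost_per_year():,.0f}/year in disc space")
    print(f"${added_ownership_cost_per_year():,.0f}/year/year in ownership cost")
```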